This project uses Python 3.7 and was developed by Felipe Solares da Silva. It is part of my professional portfolio; if you want to see more projects like this, check out my portfolio at https://github.com/fsolares/professional-portfolio.
Contact: solares.fs@gmail.com
For this project, we're going to use only three libraries: geopandas, pandas and folium. Pandas is an old friend of every Data Scientist, so I'm assuming that you already have it installed on your machine. To install the other packages, just run the code below.
!pip install folium
!pip install geopandas
Installing geopandas on Windows turned out to be quite an exercise. If you're using this OS as well and stumble over the same rocks that I did, follow the steps in this wonderful tutorial: https://geoffboeing.com/2014/09/using-geopandas-windows/. Once you have followed all the procedures above, you are ready to proceed to the next cell.
import geopandas as gpd
import pandas as pd
import folium
from folium.map import *
from folium import plugins
from folium.plugins import MeasureControl
from folium.plugins import FloatImage
from branca.colormap import LinearColormap
For this project, we're going to use two main files.
First, a GeoJSON file downloaded from the EXPLORATORY site (https://exploratory.io/map). GeoJSON is a format based on JavaScript Object Notation (JSON) that is used to encode a variety of geographic data structures. Its features include points (addresses and locations), line strings (streets, highways and boundaries), polygons (countries, provinces, tracts of land), and multi-part collections of these types. Here is an example of a GeoJSON structure:
{
  "type": "FeatureCollection",
  "features": [
    {
      "type": "Feature",
      "geometry": {
        "type": "Polygon",
        "coordinates": [
          [
            [100.0, 0.0], [101.0, 0.0], [101.0, 1.0],
            [100.0, 1.0], [100.0, 0.0]
          ]
        ]
      }
    }
  ]
}
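To get a feel for this structure before handing it to geopandas, here is a minimal sketch that parses a tiny GeoJSON document with Python's standard json module and lists the geometry type of each feature. The feature names are made up for illustration and are not taken from br_states.geojson. Note that GeoJSON stores each position as [longitude, latitude].

```python
import json

# A tiny, hypothetical FeatureCollection; the "properties" values are made up.
geojson_text = """
{
  "type": "FeatureCollection",
  "features": [
    {
      "type": "Feature",
      "properties": {"name": "Example square"},
      "geometry": {
        "type": "Polygon",
        "coordinates": [
          [[100.0, 0.0], [101.0, 0.0], [101.0, 1.0], [100.0, 1.0], [100.0, 0.0]]
        ]
      }
    },
    {
      "type": "Feature",
      "properties": {"name": "Example point"},
      "geometry": {"type": "Point", "coordinates": [100.5, 0.5]}
    }
  ]
}
"""

collection = json.loads(geojson_text)


def summarize(collection):
    """Return (name, geometry type) pairs for every feature in the collection."""
    return [
        (feature["properties"]["name"], feature["geometry"]["type"])
        for feature in collection["features"]
    ]


summary = summarize(collection)
for name, geometry_type in summary:
    print(f"{name}: {geometry_type}")
```

The "properties" dictionary is where attributes such as a state name live, and it is the same dictionary that map styling callbacks read from later on.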
And second, a CSV file created from data published on the Brazilian Health Ministry site (https://covid.saude.gov.br/). The site compiles the information reported by all Brazilian states, such as incidence, confirmed cases, confirmed deaths and mortality. It provides a CSV file, updated daily, that goes back to the first COVID occurrence. After lots of cleaning and transforming, we structured the data and stored it in a new CSV file (which you can find in this repository!) for our further analysis.
Let's use the read_file function from the geopandas package to load the br_states.geojson file and then transform the data.
geobr = gpd.read_file('br_states.geojson')
# Deleting columns.
del geobr['id']
del geobr['regiao_id']
del geobr['codigo_ibg']
# Renaming Columns.
geobr.columns = ['state', 'initials', 'geometry']
# Checking the data.
geobr.head()
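Assigning a full list to .columns depends on the columns arriving in a known order. As an alternative, here is a small sketch that drops and renames columns by name instead. It uses a plain pandas DataFrame as a stand-in for the attribute table, and the source column names 'nome' and 'sigla', as well as every value in it, are assumptions made for the example; check the real names with geobr.columns before adapting it.

```python
import pandas as pd

# Hypothetical stand-in for the attribute table.
# 'nome' and 'sigla' are assumed column names; all values are illustrative.
raw = pd.DataFrame({
    "id": [1, 2],
    "nome": ["Acre", "Bahia"],
    "sigla": ["AC", "BA"],
    "regiao_id": [3, 4],
    "codigo_ibg": [12, 29],
})

# Drop and rename by name, so the result does not depend on column order.
tidy = (
    raw.drop(columns=["id", "regiao_id", "codigo_ibg"])
       .rename(columns={"nome": "state", "sigla": "initials"})
)
print(tidy)
```

The same drop and rename calls are available on a GeoDataFrame, since it is a DataFrame subclass.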
Now, let's load BRnCov19_10052020.csv using the read_csv function, keeping only the columns we need to ease our future analysis.
This data set contains cumulative information about Brazilian occurrences from day one until May 10.
So, the goal here is to extract the May 10 portion of the data and prepare it for merging.
sus = pd.read_csv('../SUS_csv/BRnCov19_10052020.csv', sep=';'
, usecols=['estado', 'data', 'casosAcumulados', 'obitosAcumulados'])
# 1 - Renaming all selected columns.
sus.columns = ['initials', 'date', 'cumcases', 'cumdeaths']
# 2 - Changing date column data type to datetime.
sus['date'] = pd.to_datetime(sus['date'])
# 3 - Extract May/10 portion.
sus.set_index('date', inplace=True)
sus = sus.loc['2020-05-10']
sus.reset_index(inplace=True)
# 4 - Merging geobr and sus data frames.
br = geobr.merge(sus, on='initials')
# 5 - Deleting date column.
del br['date']
# 6 - Checking the data.
br.head()
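The filter-then-merge step is easy to get subtly wrong if a state is missing or duplicated, so here is a minimal sketch of the same idea on made-up numbers. It selects one day with a boolean mask instead of going through the index, and asks merge to validate the key and flag unmatched rows. The data frames and their values are hypothetical.

```python
import pandas as pd

# Made-up numbers, only to illustrate the mechanics.
states = pd.DataFrame({"state": ["Acre", "Bahia"], "initials": ["AC", "BA"]})
reports = pd.DataFrame({
    "initials": ["AC", "AC", "BA", "BA"],
    "date": pd.to_datetime(["2020-05-09", "2020-05-10", "2020-05-09", "2020-05-10"]),
    "cumcases": [10, 12, 20, 25],
    "cumdeaths": [1, 1, 2, 3],
})

# Boolean mask: keep only the rows whose date equals the target day.
snapshot = reports[reports["date"] == pd.Timestamp("2020-05-10")]

# validate="one_to_one" raises if a state appears more than once on either side;
# indicator=True adds a "_merge" column describing where each row came from.
merged = states.merge(snapshot, on="initials", how="left",
                      validate="one_to_one", indicator=True)

# States without a matching report would show up as "left_only".
unmatched = merged.loc[merged["_merge"] == "left_only", "initials"].tolist()
merged = merged.drop(columns=["date", "_merge"])

print(merged)
print("unmatched:", unmatched)
```

With how="left", a state that has no report for the day is kept with missing values rather than silently dropped, which makes gaps visible before they reach the map.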
After the merging process, the br data frame is ready for the next step. Let's run a statistical analysis using the describe() function to gather important metrics that will help in further evaluations.
br.cumcases.describe()
br.cumdeaths.describe()
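The min and max rows of describe() are the ones we will need again, so it can help to pull them into variables rather than copying the numbers by hand. A small sketch, using a stand-in series whose middle values are made up:

```python
import pandas as pd

# Stand-in for br.cumcases; the two middle values are illustrative only.
cumcases = pd.Series([362, 1500, 4000, 45444], name="cumcases")

stats = cumcases.describe()

# describe() returns floats, so cast back to int for display and reuse.
vmin, vmax = int(stats["min"]), int(stats["max"])
print(vmin, vmax)
```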
To make a dynamic map, we use the folium.Map function to initiate a map centered on our geographic regions. Brazil is located at latitude -14.235004 and longitude -51.92528 (https://www.geodatos.net/en/coordinates/brazil) and is part of South America, in the southern hemisphere. So, we're going to set our center using these values.
# Defining coordinates of where we want to center our map
c = [-14.235004, -51.925282]
# Creating the map
cumcasesmap = folium.Map(width=600, height=400, location=c, zoom_start=4, max_zoom=5, tiles='cartodbpositron')
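If you don't have a published coordinate at hand, one rough alternative is to take the middle of the bounding box of your geometries. The sketch below does that in plain Python for a list of (longitude, latitude) pairs and returns the result in the [latitude, longitude] order used for the map location. The function name bbox_center and the sample ring are hypothetical; with geopandas, the bounds themselves could come from the total_bounds attribute instead.

```python
def bbox_center(coordinates):
    """Return [lat, lon] of the bounding-box center for (lon, lat) pairs."""
    lons = [lon for lon, _ in coordinates]
    lats = [lat for _, lat in coordinates]
    return [(min(lats) + max(lats)) / 2, (min(lons) + max(lons)) / 2]


# Sample ring in GeoJSON order: (longitude, latitude).
ring = [(100.0, 0.0), (101.0, 0.0), (101.0, 1.0), (100.0, 1.0), (100.0, 0.0)]

center = bbox_center(ring)
print(center)  # [0.5, 100.5]
```

A bounding-box center is only an approximation of a visually pleasing center, so a hand-picked coordinate is still a perfectly reasonable choice.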
A colormap is what we use to assign colors to our geographic regions.
First, we need to use branca's LinearColormap to create a colormap, which is a linear interpolation between two or more colors. A branca colormap can be created from RGB tuples or from color-name shortcuts. In my map, I'll color from white through light blue to purple using a list with these colors. As its name implies, the colormap maps numbers to colors, so we need to set the endpoints of the map to the minimum and maximum of our variable. Do you remember when we ran the describe() function in the previous step? Using its min and max metrics, we'll be able to set the required boundaries.
# Creating a Colormap for Cumulative Cases
colormap = LinearColormap(colors= ['white', 'lightblue', 'purple'],
index= [362, 4000, 45444], vmin=362, vmax=45444)
colormap.caption = 'COVID-19 Cumulative Cases May/10'
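To make the idea of a linear colormap concrete, here is a plain-Python sketch of piecewise-linear interpolation between color anchors. This is not branca's implementation, just an illustration of the concept; the function name interpolate_color is hypothetical, and the RGB triples are the standard CSS values for white, lightblue and purple.

```python
def interpolate_color(value, index, colors):
    """Piecewise-linear blend between RGB anchors; values are clamped to the ends."""
    value = max(index[0], min(index[-1], value))
    segments = zip(zip(index, index[1:]), zip(colors, colors[1:]))
    for (lo, hi), (c_lo, c_hi) in segments:
        if lo <= value <= hi:
            t = (value - lo) / (hi - lo)  # 0 at the lower anchor, 1 at the upper
            return tuple(round(a + t * (b - a)) for a, b in zip(c_lo, c_hi))


anchors = [362, 4000, 45444]
rgb = [(255, 255, 255), (173, 216, 230), (128, 0, 128)]  # white, lightblue, purple

print(interpolate_color(362, anchors, rgb))    # (255, 255, 255)
print(interpolate_color(4000, anchors, rgb))   # (173, 216, 230)
print(interpolate_color(45444, anchors, rgb))  # (128, 0, 128)
```

Placing the middle anchor at 4000 rather than halfway keeps most of the color range for the lower values, which helps when one or two states have far more cases than the rest.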
Now that we have our map initiated and our colormap set, we'll use folium.GeoJson to add a layer to the map. Within this method we'll use style_function to color the regions based on the colormap and the variable value, along with highlight_function and tooltip.
folium.GeoJson(br,
name='10/05/2020',
style_function=lambda x: {'fillColor': colormap(x['properties']['cumcases']),
'color': 'black',
'fillOpacity':0.7,
'weight': 1},
highlight_function=lambda x: {'weight':1, 'color':'black', 'fillOpacity':1},
tooltip=folium.features.GeoJsonTooltip(fields=['state', 'initials', 'cumcases'],
aliases=['State:', 'Initials:', 'Cumulative Cases:'])).add_to(cumcasesmap)
colormap.add_to(cumcasesmap)
# Calling the object that holds our map.
cumcasesmap
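The style callback receives each region as a GeoJSON feature dictionary and reads the value from its "properties". The sketch below builds such a callback as a named function, with a fallback color for regions that have no value. The names make_style_function and two_tone, the 'lightgray' fallback and the sample feature are all hypothetical; two_tone only stands in for a real colormap, since any callable that turns a number into a color works here.

```python
def make_style_function(color_for, field, missing_color="lightgray"):
    """Build a style callback that colors a feature by one of its properties."""
    def style(feature):
        value = feature["properties"].get(field)
        # Fall back to a neutral color when the region has no value.
        fill = missing_color if value is None else color_for(value)
        return {"fillColor": fill, "color": "black", "fillOpacity": 0.7, "weight": 1}
    return style


# Hypothetical stand-in for a colormap: any callable from number to color string.
def two_tone(value):
    return "purple" if value >= 4000 else "white"


style_cases = make_style_function(two_tone, "cumcases")

# Made-up features shaped like the ones the callback receives.
with_value = {"type": "Feature", "properties": {"initials": "XX", "cumcases": 5000}, "geometry": None}
without_value = {"type": "Feature", "properties": {"initials": "YY"}, "geometry": None}

print(style_cases(with_value)["fillColor"])
print(style_cases(without_value)["fillColor"])
```

A factory like this can be called once per column, so the same styling logic serves more than one map.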
In this part, we're going to repeat the same process described in the previous step using cumulative deaths instead of cumulative cases.
# Centering the map at the same coordinates "c" and initiating our map.
cumdeathsmap = folium.Map(width=600, height=400, location=c, zoom_start=4, max_zoom=5, tiles='cartodbpositron')
# Creating a Colormap for Cumulative Deaths
colormap2 = LinearColormap(colors= ['white', 'pink', 'red'],
index= [11, 300, 3709], vmin= 11, vmax= 3709)
colormap2.caption = 'COVID-19 Cumulative Deaths May/10'
folium.GeoJson(br, name= '10/05/2020',
style_function= lambda x: {'fillColor': colormap2(x['properties']['cumdeaths']),
'color': 'black',
'fillOpacity': 0.7,
'weight': 1},
highlight_function= lambda x: {'weight': 1, 'color': 'black', 'fillOpacity': 1},
tooltip=folium.features.GeoJsonTooltip(fields= ['state', 'initials', 'cumdeaths'],
aliases= ['State:', 'Initials:', 'Cumulative Deaths:'])).add_to(cumdeathsmap)
colormap2.add_to(cumdeathsmap)
# Calling the object that holds our map.
cumdeathsmap